Detecting community structure in real-world networks is a challenging problem.
Recently, it has been shown that the resolution of methods based on optimizing
a modularity measure or a corresponding energy function is limited; communities
with sizes below some threshold remain unresolved. One way to circumvent this
problem is to vary the threshold by using a tuning parameter, and to investigate
the community structure at variable resolutions. Here, we analyze the resolution
limit and multiresolution behavior for two different methods: a q-state Potts
method proposed by Reichardt and Bornholdt, and a recent multiresolution method
by Arenas, Fernandez, and Gomez. These methods are studied analytically, and
applied to three test networks using simulated annealing.


Networks consisting of nodes and links are an efficient way to represent and
study a large variety of technological, biological and social complex systems
[1-5]. Usually the functionality of these systems is of central interest, which,
in turn, is closely related to the structure of the corresponding networks. In
particular, substructures called modules or communities are abundant in
networks. These communities are, loosely speaking, groups of nodes that are
densely interconnected but only sparsely connected with the rest of the network;
consider, e.g., groups of individuals interacting with each other in social
networks, or functional modules in metabolic networks. As communities are
supposed to play a special role in the often stochastic dynamics of the systems
under consideration, their identification is crucial. Thus, reliable and
computationally tractable methods for detecting them in empirical networks are
required.

Several methods and algorithms have been developed for community detection
[?, 5]. One popular class of methods is based on optimizing a global quality
function called modularity [9], or a closely related Hamiltonian [10], which
contains the modularity as a special case. The related methods are
computationally demanding, especially for large networks, but various
approximative algorithms exist [?]. For many test networks, these methods have
been shown to perform well [?, 15]. However, it has recently been shown that the
resolution of the modularity-based methods is intrinsically limited: modularity
optimization fails to find small communities in large networks, and small groups
of connected nodes instead turn out merged into larger communities [16]. For the
Hamiltonian-based method, there is also a resolution limit due to similar
underlying reasons [17], though this method contains a tuning parameter which
can be used to study communities of different sizes. Recently, Arenas et al.
proposed a modification of the modularity optimization method which also
provides a parameter that can be used to probe the community structure at
different resolutions. Here, we compare these two methods and their resolutions
analytically, pointing out similarities and differences. Subsequently we apply
them to several test networks using optimization by simulated annealing.

We start by briefly reviewing the concept of modularity, introduced by Newman
and Girvan [5]. The modularity $Q$ is defined as

$$ Q = \frac{1}{L} \sum_s \left( l_{ss} - [l_{ss}] \right), \qquad (1) $$

where $L$ is the number of links in the network, $K = 2L$ is the degree sum of
the network, $l_{ss}$ is the number of links in community $s$,
$[l_{ss}] = K_s^2/2K$ is the expected number of links inside community $s$,
given that the network is random, and $K_s$ is the sum of the degrees of nodes
in community $s$. In modularity optimization, the goal is to assign all nodes
into communities such that $Q$ is maximized.
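
As a concrete illustration of the definition above, the following minimal
Python sketch evaluates the modularity of a given partition. It assumes an
unweighted, undirected graph without self-loops and the usual normalization by
the number of links L = K/2; the function name `modularity` and the toy graph
are hypothetical and chosen only for this example.

```python
from collections import defaultdict

def modularity(edges, community):
    """Q = (1/L) * sum_s (l_ss - K_s**2 / (2K)) for an unweighted graph.

    edges: iterable of (u, v) pairs, each undirected link listed once.
    community: dict mapping node -> community label.
    """
    edges = list(edges)
    L = len(edges)               # number of links
    K = 2 * L                    # degree sum of the network
    l_in = defaultdict(int)      # l_ss: links inside community s
    K_s = defaultdict(int)       # K_s: sum of degrees in community s
    for u, v in edges:
        K_s[community[u]] += 1
        K_s[community[v]] += 1
        if community[u] == community[v]:
            l_in[community[u]] += 1
    return sum(l_in[s] - K_s[s] ** 2 / (2 * K) for s in K_s) / L

# Toy graph (hypothetical): two triangles joined by a single link.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_block = {n: "a" for n in range(6)}

print(modularity(edges, two_triangles))   # the two-triangle split
print(modularity(edges, one_block))       # everything in one community
```

Placing all nodes in a single community always gives Q = 0 with this
definition, so any partition with positive Q is an improvement over the
trivial one.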

The Hamiltonian-based method introduced by Reichardt and Bornholdt (RB) is based
on considering the community indices of nodes as spins in a q-state Potts model,
such that if the energy of such a system is minimized, groups of nodes with
dense internal connections should end up having parallel spins [10]. The
Hamiltonian for the system is defined as follows:

$$ \mathcal{H} = -\sum_s \left( l_{ss} - \gamma [l_{ss}]_{p_{ij}} \right), \qquad (2) $$

where $[l_{ss}]_{p_{ij}}$ is the expected number of links in community $s$,
given the null model $p_{ij}$, and $\gamma$ is a tunable parameter. Minimizing
$\mathcal{H}$ defines the community structure. When $\gamma = 1$, Eq. (2)
becomes Eq. (1) apart from a constant factor. Hence the RB method contains the
modularity optimization as a special case, and can be viewed in a more general
framework. Changing $\gamma$ allows one to explore the community structure at
different resolutions, but communities with large differences in size cannot be
simultaneously detected using a single value of $\gamma$ [17].
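
The effect of the tuning parameter can be seen on a small example. The sketch
below evaluates the RB energy for two candidate partitions of a toy graph. It
assumes the null model p_ij = k_i k_j / (2L), for which the expected number of
internal links is K_s^2/(4L); the function name `rb_energy` and the graph (a
ring of four triangles) are hypothetical.

```python
from collections import defaultdict

def rb_energy(edges, community, gamma=1.0):
    """H = -sum_s (l_ss - gamma * [l_ss]) with [l_ss] = K_s**2 / (4L).

    The expected link count assumes the null model p_ij = k_i k_j / (2L);
    another null model would change only the [l_ss] term.
    """
    edges = list(edges)
    L = len(edges)
    l_in = defaultdict(int)      # links inside each community
    K_s = defaultdict(int)       # degree sum of each community
    for u, v in edges:
        K_s[community[u]] += 1
        K_s[community[v]] += 1
        if community[u] == community[v]:
            l_in[community[u]] += 1
    return -sum(l_in[s] - gamma * K_s[s] ** 2 / (4 * L) for s in K_s)

# Toy graph (hypothetical): four triangles on a ring, neighbours joined by one link.
edges = []
for c in range(4):
    a, b, d = 3 * c, 3 * c + 1, 3 * c + 2
    edges += [(a, b), (b, d), (a, d), (d, (3 * c + 3) % 12)]
triangles = {n: n // 3 for n in range(12)}   # one community per triangle
pairs = {n: n // 6 for n in range(12)}       # neighbouring triangles merged

for gamma in (0.25, 1.0, 2.0):
    print(gamma, rb_energy(edges, triangles, gamma), rb_energy(edges, pairs, gamma))
```

In this example the merged partition has the lower energy at the smallest value
of the parameter, while the individual triangles are preferred at the larger
values, in line with the qualitative role of the tuning parameter described
above.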

Recently Arenas, Fernandez and Gomez (AFG) proposed a method [18] for augmenting
modularity optimization with a parameter, which similarly to $\gamma$ above
allows tuning the resolution of the method. This approach considers the network
to be weighted. The trick introduced by Arenas et al. [18] is to add a self-link
of weight $r$ to each node, in which case the modularity becomes

$$ Q_w(r) = \frac{1}{W(r)} \sum_s \left( w_{ss}(r) - [w_{ss}(r)] \right), \qquad (3) $$

where $W(r)$ is the total link weight in the network (including self-links),
$w_{ss}(r)$ is the total link weight inside community $s$ and $[w_{ss}(r)]$ is
its expected value. The parameter $r$ adjusts the total weight in the network,
which in turn changes the community detection resolution [18]. Sweeping $r$ and
observing which communities are most stable with respect to changes in $r$
should reveal the community structure.
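
A minimal sketch of the self-link construction is given below. It is not the
authors' implementation: it assumes that a self-link of weight r adds r to the
strength of its node, so that the strength sum becomes 2W + N r and community s
gains n_s r of internal strength, and that the expected internal weight takes
the usual form S_s(r)^2 / (4 W(r)). The name `afg_modularity` and the toy graph
are hypothetical.

```python
from collections import defaultdict

def afg_modularity(weighted_edges, community, r=0.0):
    """Weighted modularity after adding a self-link of weight r to every node.

    Assumed convention: a self-link adds r to its node's strength, so the
    strength sum is 2W + N*r and community s gains n_s*r internally.
    """
    two_w_in = defaultdict(float)    # 2 * w_ss, without self-links
    S = defaultdict(float)           # S_s: total node strength in community s
    for u, v, w in weighted_edges:
        S[community[u]] += w
        S[community[v]] += w
        if community[u] == community[v]:
            two_w_in[community[u]] += 2 * w
    n = defaultdict(int)             # n_s: number of nodes in community s
    for s in community.values():
        n[s] += 1
    two_W_r = sum(S.values()) + len(community) * r    # 2 * W(r)
    if two_W_r <= 0:
        raise ValueError("total weight including self-links must be positive")
    return sum((two_w_in[s] + n[s] * r) / two_W_r
               - ((S[s] + n[s] * r) / two_W_r) ** 2 for s in n)

# Toy graph (hypothetical): four unit-weight triangles on a ring.
edges = []
for c in range(4):
    a, b, d = 3 * c, 3 * c + 1, 3 * c + 2
    edges += [(a, b, 1.0), (b, d, 1.0), (a, d, 1.0), (d, (3 * c + 3) % 12, 1.0)]
triangles = {n: n // 3 for n in range(12)}
pairs = {n: n // 6 for n in range(12)}

for r in (-2.0, 0.0, 1.0):
    print(r, afg_modularity(edges, triangles, r), afg_modularity(edges, pairs, r))
```

In this toy case the merged partition scores higher only for a sufficiently
negative r (which this sketch permits as long as the total weight stays
positive), whereas r = 0 reproduces ordinary modularity and favours the
individual triangles.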

Eqs. (2) and (3) suggest that the RB and AFG methods are somewhat related,
although not equal. The tuning parameters, $\gamma$ and $r$, behave
qualitatively in the same way: large parameter values allow finding small
communities, and small values yield large communities. In fact, in the RB
method, the effect of $\gamma$ in Eq. (2) can be interpreted such that the
"effective" number of links in the network equals $L/\gamma$, whereas the
parameter $r$ in Eq. (3) changes the total weight in the network. However, there
is a difference: $r$ also increases the sum of weights within a community,
whereas $\gamma$ has no effect on the number of links within a community. In
order to illustrate the connection between these methods, we next derive the
"resolution limit" intrinsic to Eq. (3) in the AFG method.

Now suppose that a network consists of "physical" communities, which are somehow
known to us. We consider two of these communities, $s$ and $t$, such that the
sum of weights of edges connecting them is $w_{st}$. If these "physical"
communities are merged by the detection method, the modularity $Q_w(r)$ changes
by $\Delta Q_w(r) = \frac{1}{W(r)} \left( w_{st} - [w_{st}(r)] \right)$. The
optimization of modularity should merge these communities if
$\Delta Q_w(r) > 0$, which yields

$$ S_s(r) S_t(r) < 2 W(r) w_{st}, \qquad (4) $$

where $S_s(r)$ is the total node strength in community $s$ and we have used
$[w_{st}(r)] = S_s(r) S_t(r) / 2W(r)$. An analogous result for the RB method is
$\gamma K_s K_t < 2 L l_{st}$, where $K_s$ is the total node degree in
community $s$. Hence
the tuning parameters $\gamma$ and $r$ are not identical, and they affect the optimization
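outcome differently. However, if we assume that
$S_s = S_t \approx n_s \langle s \rangle$, $n_s = n_t$ and
$K_s \approx n_s \langle k \rangle$, Eq. (4) reduces to

$$ n_s < \sqrt{ \frac{N w_{st}}{\langle s \rangle + r} }, \qquad (5) $$

where $n_s$ is the number of nodes in community $s$, $N$ is the number of nodes
in the network, and each self-link is taken to add $r$ to the strength of its
node, so that $S_s(r) \approx n_s (\langle s \rangle + r)$ and
$2W(r) \approx N (\langle s \rangle + r)$. This bears resemblance to the
corresponding RB result: $n_s < \sqrt{N l_{st} / (\gamma \langle k \rangle)}$.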
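
The two merging conditions can be compared numerically. The sketch below is
only an illustration under stated assumptions: the function names are
hypothetical, the AFG version assumes that each self-link adds r to the
strength of its node, and the example numbers describe a hypothetical ring of
30 five-node cliques, each joined to its neighbours by a single link.

```python
def rb_merges(K_s, K_t, l_st, L, gamma):
    """RB condition: merging s and t lowers the energy if
    gamma * K_s * K_t < 2 * L * l_st."""
    return gamma * K_s * K_t < 2 * L * l_st

def afg_merges(S_s, S_t, w_st, W, n_s, n_t, N, r):
    """AFG condition S_s(r) * S_t(r) < 2 * W(r) * w_st.

    Assumed convention: S_s(r) = S_s + n_s * r and 2 W(r) = 2 W + N * r,
    where S_s and W are the values without self-links.
    """
    return (S_s + n_s * r) * (S_t + n_t * r) < (2 * W + N * r) * w_st

# Hypothetical ring of 30 cliques with 5 nodes each (unit weights):
# every clique has 10 internal links plus one link to the next clique.
n_cliques, size = 30, 5
L = n_cliques * (size * (size - 1) // 2 + 1)    # 330 links in total
K_clique = size * (size - 1) + 2                # degree sum of one clique: 22
N = n_cliques * size

for gamma in (1.0, 1.5):
    print("RB ", gamma, rb_merges(K_clique, K_clique, 1, L, gamma))
for r in (0.0, 2.0):
    print("AFG", r, afg_merges(K_clique, K_clique, 1, L, size, size, N, r))
```

With these numbers, neighbouring cliques satisfy the merging condition at the
traditional modularity point of either method, and stop satisfying it once the
corresponding tuning parameter is raised far enough.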

Next, we present some numerical results obtained by sweeping the tuning
parameters $\gamma$ and $r$ of the RB and AFG methods across a range of values,
and optimizing the respective energy functions using simulated annealing. Three
different test networks are used. We show the behavior of the number of
communities detected by the methods as a function of the tuning parameter, and
look for "stable" regions where this number remains constant [18]. Community
structures detected using several values of $\gamma$ in the RB method have been
reported in [10], but to our knowledge complete sweeps and stability analysis
have not been reported previously.
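
A parameter sweep of this kind can be sketched as follows. This is a
simplified illustration, not the procedure used for the results reported here:
it assumes the RB energy with the null model p_ij = k_i k_j / (2L), single-node
moves between N possible labels, Metropolis acceptance and a geometric cooling
schedule. All function names, parameter values and the toy graph are
hypothetical.

```python
import math
import random

def rb_energy(adj, label, gamma):
    """H = -sum_s (l_ss - gamma * K_s**2 / (4L)), null model p_ij = k_i k_j / (2L)."""
    two_L = sum(len(nb) for nb in adj.values())
    K, l_in = {}, {}
    for i, nb in adj.items():
        s = label[i]
        K[s] = K.get(s, 0) + len(nb)
        # each internal link is seen from both of its ends, hence 0.5
        l_in[s] = l_in.get(s, 0.0) + sum(0.5 for j in nb if label[j] == s)
    return -sum(l_in[s] - gamma * K[s] ** 2 / (2 * two_L) for s in K)

def anneal_rb(adj, gamma, steps=20000, T0=1.0, cooling=0.9995, seed=0):
    """Simulated annealing; returns the lowest-energy labelling seen and its energy."""
    rng = random.Random(seed)
    nodes = list(adj)
    N = len(nodes)
    two_L = sum(len(adj[i]) for i in nodes)
    label = {i: idx for idx, i in enumerate(nodes)}   # start from singletons
    K = [len(adj[i]) for i in nodes]                  # degree sum per label
    H = rb_energy(adj, label, gamma)
    best, best_H, T = dict(label), H, T0
    for _ in range(steps):
        i = rng.choice(nodes)
        a, b = label[i], rng.randrange(N)
        if a != b:
            k_i = len(adj[i])
            e_a = sum(1 for j in adj[i] if label[j] == a)
            e_b = sum(1 for j in adj[i] if label[j] == b)
            # energy change of moving node i from community a to community b
            dH = (e_a - e_b) + gamma * k_i * (K[b] - K[a] + k_i) / two_L
            if dH <= 0 or rng.random() < math.exp(-dH / T):
                label[i] = b
                K[a] -= k_i
                K[b] += k_i
                H += dH
                if H < best_H:
                    best, best_H = dict(label), H
        T *= cooling                                  # geometric cooling
    return best, best_H

def ring_of_cliques(n_cliques, size):
    """Hypothetical toy graph: cliques on a ring, neighbours joined by one link."""
    adj = {i: set() for i in range(n_cliques * size)}
    for c in range(n_cliques):
        members = range(c * size, (c + 1) * size)
        for u in members:
            adj[u].update(v for v in members if v != u)
        u, v = (c + 1) * size - 1, ((c + 1) * size) % (n_cliques * size)
        adj[u].add(v)
        adj[v].add(u)
    return adj

adj = ring_of_cliques(4, 5)
for gamma in (0.1, 0.5, 1.0, 2.0, 4.0):
    label, H = anneal_rb(adj, gamma)
    print(gamma, len(set(label.values())), round(H, 3))
```

Because annealing is stochastic, the number of communities found at a given
parameter value may depend on the seed and on the cooling schedule, which is
one reason to look for values that persist over a range of the parameter.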

Our first test network is a synthetic, hierarchical scale-free network of
$N = 125$ nodes. This unweighted network can be viewed as consisting of 5
communities of 25 nodes each, which can be further divided into five-node
cliques (for a visualization of this network, see [19] or [18]). Figure 1A
shows the number of communities detected using the RB and AFG methods. Both
methods are able to reveal the large communities at small values of the sweeping
parameter, although the AFG method seems to perform slightly better. One should
note that this might be a feature of the numerical optimization, and not of the
method itself. We remind the reader that the "traditional" modularity
optimization corresponds to $\gamma = 1$ and $r = 0$. These points are shown in
the figures as vertical lines. Our results for the AFG method are consistent
with those reported in [18].

Figure 1. Number of communities as detected with simulated annealing using the
RB (upper) and AFG (lower) methods. A: hierarchical scale-free network [19] of
125 nodes, B: Zachary's karate club. The vertical line denotes the traditional
modularity optimization case.

Our second test network is a small, unweighted network representing Zachary's
karate club [20], which has often been used as a "testbed" for community
detection. Modularity optimization is known to yield four communities, whereas
this club was observed to split into two communities. In [18], the authors
demonstrated that the AFG method is able to find exactly those communities (by
using the weighted version of this network). Results for the unweighted network
in Fig. 1B show that both methods give similar results and are able to detect
the two communities. A closer inspection shows that the communities correspond
to the split which eventually happened (except for one individual classified
differently by the RB method).
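
Reading off "stable" regions from a sweep can also be automated. The helper
below is a hypothetical sketch: it scans the number of communities found at
successive parameter values and reports runs where that number stays constant.
The sweep values in the example are made up for illustration and are not
results from the figures.

```python
def stable_regions(params, counts, min_len=3):
    """Return (first_param, last_param, count) for every run of at least
    min_len consecutive sweep points with the same number of communities."""
    regions, start = [], 0
    for i in range(1, len(counts) + 1):
        # close the current run at the end of the data or when the count changes
        if i == len(counts) or counts[i] != counts[start]:
            if i - start >= min_len:
                regions.append((params[start], params[i - 1], counts[start]))
            start = i
    return regions

# Made-up sweep output: tuning-parameter values and community counts.
gammas = [0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 6.4, 12.8]
counts = [1, 5, 5, 5, 12, 25, 25, 25]
print(stable_regions(gammas, counts))
```

The threshold `min_len` is a free choice; a plateau that spans only a few
closely spaced sweep points is weaker evidence than one that spans a wide
range of the tuning parameter.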

Our third test network is weighted, larger than the previous examples (986
nodes), and has a more complex community structure, Fig. 2a. The average degree
of this network is $\langle k \rangle = 6$ and it has been generated with a
model designed to resemble real, weighted social networks. Visually, the
communities are less apparent than in the previous test networks, although it
can be seen that there are dense groups of nodes with strong internal links,
connected by weaker links. Applying the clique percolation method [?] to this
network using clique size 4 yields communities whose sizes vary from 4 nodes
(20 communities) to 43 nodes (1 community). Because the network is weighted, we
have used a weighted Hamiltonian instead of Eq. (2) for the RB method [17].
Results in Fig. 2b show that no clear "stable" regions of the tuning parameters
with a constant number of communities are apparent. One possible explanation is
that this is due to the quite non-uniform distribution of community sizes,
which may result in large communities continuously being split into smaller
ones as the tuning parameters are increased. A similar situation could occur
for many large real-world networks. However, by using small values of $\gamma$
and $r$ it might be possible to study the large-scale community structure, such
that the network is split into a small number of large communities.

Figure 2. (Color online) a) A weighted test network having 986 nodes. Link
colors vary from blue (weak) to red (strong). b) Number of communities for the
network as a function of the tuning parameters. Note that we have limited the
number of communities to 300.

We have discussed the limited resolution of community detection methods where a
global energy-like quantity is optimized, focusing especially on two methods
(RB and AFG) where the resolution can be adjusted using a tuning parameter.
Although the tuning parameters of these two methods give rise to qualitatively
similar changes in resolution, analytic derivations show that their effect on
the resolution limit is somewhat different. These two methods have also been
numerically tested by using simulated annealing, with the result that in small
test networks, stable regions of tuning parameter values, where the number of
communities is constant, can easily be found. These can be viewed as reflecting
"optimal" communities. However, on a large, weighted test network, where the
clique percolation method indicates a broader distribution of community sizes,
no such regions are apparent.